
CBoms #51

Merged
morzan1001 merged 63 commits into main from
feature/cbom-phase1
Apr 29, 2026

Conversation

@morzan1001
Owner

Since there was a request for this in #48, I've started making crypto dependencies and settings trackable via CBOMs. This pull request contains the first backend changes; a second one will follow.

Add crypto_policy service package with four YAML seed files (NIST SP 800-131A,
BSI TR-02102, CNSA 2.0, NIST PQC) and a seeder that idempotently upserts the
system policy only when the stored version is below CURRENT_SEED_VERSION.
Add CryptoAssetRepository.ensure_indexes(), CryptoPolicyRepository.ensure_indexes(),
and seed_crypto_policies() calls to the application startup handler so that
crypto collection indexes and the built-in seed policy are ready on boot.
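
For illustration, the version gate in the seeder can be sketched like this (a minimal in-memory stand-in; the real seeder writes to MongoDB, and `CURRENT_SEED_VERSION`'s value here is an assumption):

```python
# Hypothetical sketch of the version-gated seeding described above.
CURRENT_SEED_VERSION = 4  # assumed value, for illustration only

def seed_system_policy(store: dict, seed_policy: dict) -> bool:
    """Upsert the built-in system policy only when the stored copy is
    missing or older than CURRENT_SEED_VERSION. Returns True on write."""
    existing = store.get("system")
    if existing is not None and existing["seed_version"] >= CURRENT_SEED_VERSION:
        return False  # already up to date: idempotent no-op
    store["system"] = {**seed_policy, "seed_version": CURRENT_SEED_VERSION}
    return True
```

Calling it twice with the same store writes once and then no-ops, which is what makes re-running the startup handler safe.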
Implements Task 3.1 of the CBOM Phase 1 plan with three read-only endpoints:
- GET /api/v1/projects/{project_id}/crypto-assets: list with pagination and filtering
- GET /api/v1/projects/{project_id}/crypto-assets/{asset_id}: get single asset
- GET /api/v1/projects/{project_id}/scans/{scan_id}/crypto-assets/summary: aggregate by type

All endpoints require project membership and return 403 if user lacks access.
Pagination limit capped at 500 to prevent resource exhaustion. Filtering supports
asset_type, primitive, and name_search (case-insensitive regex).

Test coverage includes repository integration tests verifying pagination, filtering,
and summary aggregation logic.
morzan1001 and others added 29 commits April 21, 2026 16:35
* feat: add Phase 2 crypto finding types (cert lifecycle + weak protocol)

* feat: add expiry-thresholds and cipher-weakness fields to CryptoRule

* feat: add IANA TLS cipher-suite catalog + loader

* feat: seed cert-lifecycle + protocol-cipher default rules, bump seed version

* feat: add CertificateLifecycleAnalyzer with 7 checks

* feat: add ProtocolCipherSuiteAnalyzer using IANA catalog

* test: include Phase-2 types in crypto-values set for collision test

* feat: register Phase-2 analyzers in registry

* test: integration tests for Phase-2 cert + cipher analyzers

* feat: add analytics response schemas (HotspotEntry, TrendSeries, ScanDelta)

* feat: add analytics:global permission to admin preset

* feat: add analytics scope resolver with project/team/global permission gating

* feat: add TTLCache for analytics query results

* feat: denormalize scan_created_at into findings with backfill migration

* feat: add CryptoHotspotService with 5-dimensional grouping

* feat: add CryptoTrendService with 8 time-bucketed metrics

* feat: add scan-delta service keyed on (name, variant, primitive)

* feat: expose /api/v1/analytics/crypto endpoints (hotspots, trends, scan-delta)

* perf: add indexes supporting analytics aggregation queries

* feat(frontend): add crypto-analytics types and API client

* feat(frontend): add AnalyticsViewSwitcher with URL-synced state

* feat(frontend): add HotspotTable primary view

* feat(frontend): add HotspotHeatmap view

* feat(frontend): add HotspotTreemap and HotspotBarChart views

* feat(frontend): add TrendsTimeSeriesChart with metric label map

* feat(frontend): add ScanDeltaView + ProjectScans delta trigger

* feat(frontend): add CryptoHotspotsPage with view switcher

* feat(frontend): add CryptoTrendsPage and wire sub-tabs into CryptographyTab

* feat(frontend): add CrossProjectNetworkView for team/global scopes

* feat(frontend): add team crypto-analytics sub-tab

* feat(frontend): add admin global crypto analytics + Metabase deep-link

* feat: add MCP tools for crypto hotspots, trends, and scan-delta

* docs: document VITE_METABASE_CRYPTO_DASHBOARD_URL env var

* revert: remove admin + team crypto analytics pages and Metabase references

Crypto analytics stay project-scoped in the Cryptography tab.
The admin /settings/crypto-analytics page, the team-analytics sub-tab in
TeamMembersDialog, and the CrossProjectNetworkView component are removed.
All Metabase deep-link references are stripped (no such dashboard exists).

Backend analytics endpoints for team/global scopes remain — they serve
MCP tool calls and are ready for future UIs if needed.

* refactor: move crypto analytics into main /analytics page

Removes the Cryptography top-level tab from ProjectDetails. All crypto
analysis views (Hotspots, Trends, Inventory, Findings) now live in the
shared /analytics page as a new Cryptography tab, matching the pattern
of VulnerabilityHotspots and other cross-project analyses.

Backend: adds a 'user' scope to ScopeResolver that resolves to all
projects the current user has access to (no elevated permission needed).

Per-project 'Crypto Policy' tab on ProjectDetails is preserved - that
is configuration, not analytics.

* fix: include 'user' in crypto analytics scope regex

* fix(tests): correct $sum accumulator init in _FakeDb aggregate helper

* chore(frontend): remove dead CryptoAssetTable component

* refactor(frontend): simplify crypto tab permission check to analytics:read

* refactor(frontend): drop unused 'network' variant from AnalyticsView

* fix: extend 'user' scope support to HotspotResponse schema and _FakeCollection.find projection arg

* feat: add CRYPTO_KEY_MANAGEMENT FindingType for crypto-misuse SAST rules

* feat: tag crypto-misuse-* SAST rules as CRYPTO_KEY_MANAGEMENT findings

* feat: wire crypto-misuse rules into SAST templates + standalone scan

* feat: add PolicyAuditEntry model + PolicyAuditAction enum

* feat: add compute_change_summary for policy audit diffs

* feat: record_policy_change service + crypto_policy.changed webhook event

* feat: PolicyAuditRepository with insert/list/get/delete-older-than

* feat: write PolicyAuditEntry on every crypto-policy mutation

Hook record_policy_change into PUT /crypto-policies/system,
PUT /projects/{id}/crypto-policy, DELETE /projects/{id}/crypto-policy,
and the seeder. Accept optional comment field on PUT bodies.
Add owner_auth_headers_proj_p2 fixture to conftest.

* feat: policy audit list/detail/revert/prune endpoints

Add policy_audit router with GET/POST/DELETE endpoints for system and
project audit history. Register router in main.py under /api/v1 prefix.

* feat: policy audit retention + startup pruning

Add prune_old_audit_entries() service (honours POLICY_AUDIT_RETENTION_DAYS),
wire PolicyAuditRepository.ensure_indexes() and the retention pruner into
app startup. Add distinct() to _FakeCollection. Fix prune endpoint datetime
query parsing to tolerate URL-encoded '+' timezone offset.

* feat: add compliance reporting schemas + enums

* test: extend fake DB count_documents to support $in and comparison operators

* feat: ComplianceReport model + repository

* feat: compliance framework base + default evaluator + EvaluationInput

* feat: NIST SP 800-131A + BSI TR-02102 + CNSA 2.0 compliance frameworks

* feat: FIPS 140-3 + ISO 19790 algorithm-level compliance frameworks

* feat: ComplianceReportEngine orchestrator (placeholders for renderer pipeline)

* feat: JSON + CSV renderers for compliance reports

* feat: SARIF 2.1.0 renderer for compliance reports

* feat: PDF renderer via WeasyPrint + renderer registry

* feat: engine _gather_inputs + _render + _store_artifact implementations

* feat: compliance report REST endpoints + background generation

* test: format-coverage + expiry integration tests for compliance reports

* feat: PQC migration mappings YAML + loader + schemas

* feat: PQC migration priority scoring (exposure/weakness/deadline/count)

* feat: PQC migration generator + REST endpoint + drift sentinel

* feat: PQC-migration-plan meta-framework for compliance-report export

* feat: four Phase-3 MCP tools (PQC plan, reports list, audit, framework summary)

* feat(frontend): Phase 3 types + API clients for compliance / PQC / audit

* feat(frontend): PQC migration panel + table + detail drawer

* feat(frontend): compliance reports panel with polling + download

* feat(frontend): expose PQC + Compliance tabs in Crypto analytics

* feat(frontend): policy audit timeline + diff view + revert dialog

* feat(frontend): mount policy audit timelines on admin + project pages

* fix(compliance): PQC framework async evaluation to avoid asyncio.run in running loop

* fix(compliance): remove FIPS ECDSA phantom control (never matched findings)

* fix(compliance): CSV evidence_count now sums both findings and asset bom_refs

* test(compliance): assert engine passes real EvaluationInput to framework.evaluate

* feat(compliance): retention sweeper deletes expired reports + GridFS artifacts

* test: consolidate fake-DB range operator implementations ($gte/$lte/$gt/$lt)

* refactor(compliance): remove unused imports in frameworks/base.py

* refactor(pqc): expose clear_mappings_cache() for tests

* fix(frontend): keyboard-accessible rows in migration + compliance tables

* ruff format

* refactor(analytics): extract useAnalyticsView hook to its own module

Resolves react-refresh/only-export-components: the switcher module now
exports only its component + type; the hook moves to a sibling file and
is imported by both the switcher and the parent tab.

* fix(findings): prefer-const + derive scroll target without ref-read during render

Replaces an IIFE that read hasScrolledRef.current while mapping rows
with a pure findIndex up-front. hasScrolledRef is now only consulted
inside useEffect. Also tightens 'let res' to 'const res'.

* fix(project): remove setState-in-effect from AnalyzerSettingsDialog

Drops the reset-on-open useEffect; parent now passes
key={openSettingsAnalyzer} so the dialog remounts and re-initializes
its local state from props — the React-recommended pattern for
resetting state when an identifier changes.

Reference: https://react.dev/learn/you-might-not-need-an-effect

* style: fix ruff findings (unused imports + variables)

Auto-fixes from ruff --fix removed 40 unused imports across app/ and
tests/. Manually removed 4 unused local variables in chat/tools.py
(project_repo, finding_repo, scan_repo, waiver_repo) along with their
now-unused repository imports.

* fix(types): resolve all mypy errors (209 → 0)

Add type annotations across 39 backend files. No behavior change.

Key patterns fixed:
- FastAPI endpoints: typed `db: AsyncIOMotorDatabase` and `current_user: User`
- MongoDB aggregation pipelines annotated as `list[dict[str, Any]]`
- Scope resolver calls cast str → Literal["project","team","global","user"]
- Variable name collisions renamed (latest_rows, hotspots_pipeline)
- weasyprint import: # type: ignore[import-untyped] (no stubs published)
- Framework protocol conformance: ClassVar annotations where needed

Verified: ruff clean, mypy 0 errors, pytest 274 passed (+2 pre-existing
live-Mongo failures unchanged).

* fix(webhooks): expose all 6 webhook events in subscription UI

The webhook subscription dialog only surfaced scan_completed and
vulnerability_found while the backend has been accepting 6 event
types (plus the upcoming pqc_migration_plan.generated). Expand the
static event catalogue to include:
  - analysis_failed
  - crypto_asset.ingested
  - crypto_policy.changed
  - compliance_report.generated
  - pqc_migration_plan.generated

Each entry now carries a user-friendly label and description, and the
checkbox list has a max-height + overflow so the dialog stays compact.

Adds a smoke test that opens the dialog and asserts every event label
renders.

* feat(frontend): delete-report and prune-audit UIs for Phase-3 admin flows

The deleteReport() and pruneSystemAudit()/pruneProjectAudit() API
client functions were previously unreachable from the UI.

ReportDetailDrawer now exposes a destructive "Delete report" button
behind a confirmation dialog. On success it invalidates the
compliance-reports query and closes the drawer; on failure (e.g. 403
for non-owners) the backend error message is surfaced via a toast.

PolicyAuditTimeline grows a "Prune old entries" button in the header
(gated on canRevert, matching the existing admin gate for revert).
The new PruneAuditDialog prompts for a cutoff date (default: 180 days
ago), warns about the destructive nature of the operation, and calls
pruneSystemAudit/pruneProjectAudit. Backend-enforced min-cutoff
errors are rendered verbatim in the toast. Success toast reports the
deleted count.

Adds smoke tests for both flows.

* feat(compliance): permission-gated scope + scope_id in NewReportDialog

The dialog always posted scope="user" so admins could not create
team/project/global reports from the UI. Add a Scope select with
user (default), project, team and — when the caller has
system:manage or analytics:global — global.

For project/team scope, require a non-empty scope_id via an
additional text input; the field validates client-side and the
server-returned error message is surfaced via a toast on backend
validation failures.

Extends the existing smoke test to cover the default user-scope
payload and the permission gating on the Global option.

* fix(crypto-policy): expose all 12 finding-types in policy editor

The finding_type select was limited to crypto_weak_algorithm,
crypto_weak_key and crypto_quantum_vulnerable, but the backend
FindingType enum defines 12 crypto_* values (Phase-2 certificate
lifecycle + Phase-2 protocol weakness + Phase-3 key-management
hygiene).

Extends the CryptoFindingType union with the missing 9 values and
turns FINDING_TYPES into {value, label} entries with user-friendly
labels so the dropdown is readable instead of showing raw enum
strings.

* fix(audit): wire in-app notifications on policy changes

The previous implementation imported `app.services.notifications.service`
as a module and guarded every call with `hasattr`, so the methods on the
`NotificationService` instance were never reachable and notifications
silently never fired. `notify_users_with_permission` also did not exist
at all.

- Add `NotificationService.notify_users_with_permission` that queries
  active users holding any of the required permissions and fans out
  through the existing `notify_users` path.
- Import the module-level `notification_service` singleton correctly
  from `app.services.notifications.service` and drop the hasattr guards.
- System-scope policy changes now notify users holding `system:manage`
  or `analytics:global`; project-scope changes fetch the Project via
  ProjectRepository and call `notify_project_members` with the object.
- Best-effort semantics preserved by the existing try/except around
  `_notify_relevant_users` in `record_policy_change`.

* feat(pqc): fire pqc_migration_plan.generated webhook on plan generation

The Phase-3 webhook spec lists three new events but the endpoint that
serves PQC migration plans never fired the one for it. Now that plans
are first-class outputs of the compliance stack, consumers need a signal
to pick them up.

- Add `WEBHOOK_EVENT_PQC_MIGRATION_PLAN_GENERATED` to constants and
  include it in `WEBHOOK_VALID_EVENTS`.
- Wire a `BackgroundTasks` dispatch in the PQC endpoint so the webhook
  fires after the response is sent; failures are logged but never
  surfaced to the caller (mirrors compliance_reports._run_and_webhook).
- Payload carries scope, scope_id, total_items, status_counts and
  mappings_version so downstream consumers can evaluate relevance
  without refetching the plan.

* fix(compliance): CSV renderer propagates framework disclaimer

FIPS/ISO frameworks set a disclaimer (e.g. "algorithm-level conformance
only; CMVP module validation out of scope") that the PDF/JSON/SARIF
renderers surface but CSV silently dropped. A bare CSV export from a
FIPS report therefore read like a full certification pass.

- Prepend the disclaimer, framework identity and generation timestamp as
  `#`-prefixed comment lines before the header row. Excel, `pandas.read_csv
  (comment='#')` and most SIEM ingesters skip these automatically, so the
  column layout downstream consumers rely on stays intact.
- Add unit tests covering both branches (with disclaimer / without).

* fix(audit): handle pagination boundary in policy diff view

entries[idx + 1] is undefined when idx is the last loaded entry and
the query was capped at 50 rows. The diff view then rendered every
current rule as "added", which is wrong when there are more versions
below the window.

Detect the boundary via (isLast && !previous && entries.length >=
PAGE_SIZE && entry.version > 1) and, in that case, render a
"Previous version is beyond the loaded window" hint alongside a
read-only JSON snapshot of the current entry. Genuine first-version
entries (version === 1 or entries.length < PAGE_SIZE) still get the
regular diff view with previous === undefined handled by
PolicyDiffView as before.

Adds a dedicated test that seeds a 50-entry full page with versions
100..51 and asserts the truncation hint renders when the oldest
entry is expanded.

* fix(normalizer): detect crypto-misuse rules regardless of semgrep path prefix

Semgrep / OpenGrep may emit check_id either as the bare rule name
(`crypto-misuse-ecb-mode-python`) or as a dotted path when rules are
loaded from a filesystem (`rules.crypto-misuse.ecb-mode.crypto-misuse-ecb-mode-python`).
The previous `startswith` against the full string only
caught the first form, so crypto-misuse findings from a path-based
Semgrep invocation silently fell through as generic SAST and never
received the CRYPTO_KEY_MANAGEMENT tag the compliance pipeline expects.

Inspect the final dot-separated segment too, with the original
`startswith` kept as a fast-path. Add regression tests for the
nested-path shape plus negative cases.
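
The matcher amounts to something like this (a sketch, not the actual implementation; the prefix is taken from the example above):

```python
def is_crypto_misuse(check_id: str, prefix: str = "crypto-misuse-") -> bool:
    """Match both bare rule names and dotted filesystem-path check_ids."""
    if check_id.startswith(prefix):  # fast path: bare rule name
        return True
    # path-based invocations emit e.g. rules.crypto-misuse.ecb-mode.<rule>,
    # so also inspect the final dot-separated segment
    return check_id.rsplit(".", 1)[-1].startswith(prefix)
```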

* fix(analytics): invalidate cache on policy and waiver mutations

Hotspots, crypto trends and PQC migration plans are all computed on top
of the current crypto policy rules and the per-finding `waived` flag.
The 5-minute TTLCache in `app.services.analytics.cache` had no hook on
either mutation path, so admins saw up-to-five-minutes-stale results
after a rule toggle or waiver approval — the exact windows where
freshness matters most.

- `record_policy_change` now flushes the process-level analytics cache
  after the audit insert (still best-effort; failure never blocks the
  write).
- Waiver create/update/delete endpoints flush the cache synchronously
  before the background stats recalculation fires, closing the hole
  where a waived finding would still show as active in the next
  hotspots request.
- Unit test pins the behaviour by patching `get_analytics_cache` and
  asserting `clear()` fires during `record_policy_change`.

* fix(audit): enforce minimum prune cutoff to preserve forensic history

Before this change a system-manage admin could pass `?before=<yesterday>`
to DELETE /crypto-policies/system/audit and wipe the entire policy
change history in a single request — exactly the kind of action the
audit log exists to catch.

- Reject cutoffs newer than `now - POLICY_AUDIT_MIN_PRUNE_DAYS` with a
  400 and a clear message; default is 90 days of forensic retention,
  configurable via the env var. Invalid / non-positive env values fall
  back to the default so a typo cannot relax the guard.
- Apply the same check on the per-project prune endpoint.

Related fix in the same commit: both `revert_project_policy` and
`prune_project_audit` called `check_project_access(required_role="owner")`,
but "owner" is not in `PROJECT_ROLES = [viewer, editor, admin]`. The
helper raised `ValueError` on every invocation, returning a 500 to
admins legitimately trying to revert or prune. Normalised to
required_role="admin".
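
The cutoff guard with its typo-proof env fallback can be sketched as (the real endpoint returns HTTP 400 where this sketch raises `ValueError`):

```python
import os
from datetime import datetime, timedelta, timezone

DEFAULT_MIN_PRUNE_DAYS = 90  # default retention floor from the commit

def min_prune_days() -> int:
    """Read POLICY_AUDIT_MIN_PRUNE_DAYS, falling back to the default on
    invalid or non-positive values so a typo cannot relax the guard."""
    try:
        days = int(os.environ.get("POLICY_AUDIT_MIN_PRUNE_DAYS", ""))
    except ValueError:
        return DEFAULT_MIN_PRUNE_DAYS
    return days if days > 0 else DEFAULT_MIN_PRUNE_DAYS

def validate_cutoff(before: datetime) -> None:
    """Reject cutoffs newer than now - min_prune_days()."""
    floor = datetime.now(timezone.utc) - timedelta(days=min_prune_days())
    if before > floor:
        raise ValueError(f"cutoff must be older than {min_prune_days()} days")
```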

* refactor(compliance): tighten EvaluationInput.db type to AsyncIOMotorDatabase

`EvaluationInput.db` was typed `Optional[object]` so the type system
lost all knowledge of what consumers could call on it. The PQC
meta-framework worked around this with a runtime `cast`; other
frameworks that may need DB access in future would have had to do the
same.

Narrow the annotation to `Optional[AsyncIOMotorDatabase[Any]]` so
consumers get proper type-checking. Drop the now-unnecessary `cast`
and the `AsyncIOMotorDatabase` + `typing.cast` imports from
pqc_migration_plan.py. mypy still clean.

* refactor(compliance): replace magic status strings with enum values

`ComplianceReportRepository.count_pending_for_user` hard-coded the
strings `"pending"` and `"generating"` for its Mongo `$in` query.
If `ReportStatus` ever changes value casing or spelling the repo
would silently return zero without any test/type signal. Use
`ReportStatus.PENDING.value` / `.GENERATING.value` instead so the
source of truth stays in the enum.

* refactor: use CustomAPIRouter for all Phase-3 endpoints

Aligns Phase-3 endpoints with the project convention: CustomAPIRouter
sets response_model_by_alias=False so responses serialize 'id' instead
of '_id', matching every other endpoint in the codebase.

* refactor(webhooks): unify event naming to dot-notation with backward-compat aliases

Canonical names: scan.completed, vulnerability.found, analysis.failed.
Legacy snake_case names (scan_completed, vulnerability_found,
analysis_failed) remain accepted via WEBHOOK_EVENT_ALIASES map so
existing subscriptions in MongoDB keep working without a schema
migration. The dispatcher matches events using a $in query against
both canonical and alias forms.

Frontend WebhookManager shows canonical names in the dropdown but
continues to accept and display stored legacy values.
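
The alias handling can be sketched like this (a hypothetical subset of the alias map; the real `WEBHOOK_EVENT_ALIASES` lives in the backend constants):

```python
# Hypothetical alias map mirroring the commit message.
WEBHOOK_EVENT_ALIASES = {
    "scan.completed": "scan_completed",
    "vulnerability.found": "vulnerability_found",
    "analysis.failed": "analysis_failed",
}

def match_forms(event: str) -> list[str]:
    """All names a subscription may be stored under for this event,
    suitable for a Mongo {'events': {'$in': [...]}} query."""
    forms = {event}
    if event in WEBHOOK_EVENT_ALIASES:            # canonical -> legacy
        forms.add(WEBHOOK_EVENT_ALIASES[event])
    for canonical, legacy in WEBHOOK_EVENT_ALIASES.items():
        if legacy == event:                       # legacy -> canonical
            forms.add(canonical)
    return sorted(forms)
```

Dispatching with `$in` over both forms is what lets stored legacy subscriptions keep firing without a migration.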

* refactor(ingest): route CBOM-Ingest through ScanManager with full metadata

Align CBOM-Ingest with SBOM-Ingest: the payload now inherits BaseIngest,
accepting all CI metadata fields directly (pipeline_id, commit_hash,
branch, job_id, job_started_at, commit_message, commit_tag, project_url,
pipeline_url, pipeline_iid, project_name, pipeline_user). ScanManager
derives a deterministic scan_id (UUID5 of project+pipeline_id+commit_hash)
so re-submitting the same CI run upserts instead of creating duplicates,
and register_result('cbom', trigger_analysis=True) integrates the scan
into the standard analysis lifecycle.

Backward compatibility: legacy payloads wrapped in
{scan_metadata: {...}, cbom: {...}} still validate — a before-validator
folds scan_metadata.git_ref/commit_sha and friends onto the canonical
BaseIngest fields. The cboms.yml pipeline template already sends the
new flat shape; its metadata was previously being discarded because the
old endpoint only extracted scan_metadata.* keys.

Fixes H1 + H2.
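
The deterministic scan_id derivation amounts to something like (the namespace constant here is hypothetical; the real one lives in ScanManager):

```python
import uuid

# Hypothetical namespace; the real constant is defined in ScanManager.
SCAN_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "cbom-scan")

def derive_scan_id(project_id: str, pipeline_id: str, commit_hash: str) -> str:
    """UUID5 over project + pipeline + commit: re-submitting the same
    CI run yields the same scan_id, so the write upserts instead of
    creating a duplicate scan."""
    return str(uuid.uuid5(SCAN_NAMESPACE,
                          f"{project_id}:{pipeline_id}:{commit_hash}"))
```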

* feat(webhooks): fire sbom.ingested on SBOM ingest

New event symmetric to crypto_asset.ingested. Payload includes scan_id,
project_id, pipeline_id, commit_hash, branch, sboms_processed,
sboms_failed, dependencies_count. Best-effort: webhook failures never
block the ingest response.

Frontend WebhookManager exposes the event for subscription.

Fake-DB gains find_one_and_update so the integration test can exercise
the full ingest -> ScanManager -> register_result -> webhook flow
against the in-process test harness.

* docs(cache): document two-cache architecture and add reset helper

The codebase uses two complementary caches with distinct semantics:
  - app.core.cache.cache_service: async, Redis-backed, cross-pod
    shared. Use for external API responses (OSV, deps.dev, NPM, OIDC)
    where cross-pod dedup matters or upstream rate limits apply.
  - app.services.analytics.cache.TTLCache: sync, in-process, per-pod.
    Use for memoizing MongoDB aggregation output (hotspots, trends,
    PQC plans) where per-pod-per-TTL consistency is sufficient.

Added module-level docstrings on both files explaining when to use
which. Added reset_analytics_cache_for_tests() helper so tests can
patch or re-initialize the in-process singleton cleanly.

Addresses M4 (analytics cache strategies divergent). A full Redis
migration of analytics caching is deferred: the MongoDB aggregations
are cheap enough per-pod that the cross-pod dedup gain does not
justify the async call-site refactor today.
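
For reference, the in-process side can be sketched as a minimal TTL cache in the spirit of `app.services.analytics.cache` (the actual implementation differs in detail):

```python
import time

class TTLCache:
    """Minimal sync, in-process, per-pod TTL cache for memoizing
    aggregation output; entries are evicted lazily on read."""
    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:  # stale: drop and miss
            del self._store[key]
            return None
        return value

    def set(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self._ttl, value)

    def clear(self) -> None:  # flushed on policy/waiver mutations
        self._store.clear()
```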

* feat(audit): add policy_type discriminator to PolicyAuditEntry

PolicyAuditEntry gains a policy_type field ('crypto' | 'license',
default 'crypto'). PolicyAuditRepository accepts policy_type in list /
get_by_version / delete_older_than / count.

Backward compatibility: queries for policy_type='crypto' also match
entries written before the field was added (MongoDB $or with
$exists: False). Legacy index (policy_scope, project_id, version) is
kept alongside a new composite index that leads with policy_type.

Fake-DB matcher extended to support $or/$and/$exists/$ne in a
shared helper (consolidating three previously divergent implementations
across _fake_match_doc, _doc_matches_query and _FakeCursor._matches).

* feat(audit): record license-policy changes with change-summary

Add compute_license_policy_change_summary() + record_license_policy_change()
to the audit history service. License changes are persisted in the same
collection as crypto-policy entries (using policy_type='license') and
fire the new license_policy.changed webhook event.

The version for license-policy entries is derived from the count of
existing license-policy entries for the project — license policy has
no explicit version column on the project doc.

Also:
- Generalize _dispatch_webhook + _notify_relevant_users to accept the
  event_type / subject_noun so both policy types share the same
  dispatch code.
- Add backward-compat alias record_crypto_policy_change.
- Extract 'No effective changes' to a module constant.

* feat(projects): audit license-policy changes on project update

PUT /api/v1/projects/{id} now captures the pre-update license policy,
compares it to the post-update state, and — on any effective change —
calls record_license_policy_change() which writes an audit entry with
policy_type='license' and fires the license_policy.changed webhook.

A helper _resolve_license_policy() merges the two shapes the codebase
uses for license policy storage:
  * project.analyzer_settings['license_compliance'] (canonical, Phase 2+)
  * project.license_policy (legacy top-level field)

Audit failures never block the project update (outer try/except + the
recorder is already fail-soft internally).

* feat(projects): list/get license-policy audit entries

New REST endpoints under the existing policy-audit router:
  GET /api/v1/projects/{id}/license-policy/audit
  GET /api/v1/projects/{id}/license-policy/audit/{version}

Both gated on viewer-level project access (reads only). Revert and
prune for license-policy are deferred — revert needs a merge strategy
for project.analyzer_settings that does not stomp peer settings.

* refactor(analytics): unify SBOM + CBOM scope resolution via ScopeResolver

- ScopeResolver._resolve_user now honours PROJECT_READ_ALL (super-user
  escape hatch) so the CBOM-scope semantics match the long-standing
  SBOM-analytics behaviour when callers pass scope='user'.
- get_user_project_ids (helper used across SBOM endpoints) becomes a
  thin shim over ScopeResolver.resolve(scope='user') — a single code
  path now owns permission checking + project-id enumeration for every
  analytics surface.
- generate_pqc_migration_plan (MCP tool) takes a user parameter and
  constructs its ResolvedScope via ScopeResolver instead of hand-
  assembling the dataclass. Matches the pattern every other tool uses.

Closes H3 and L2.

* feat(compliance): add SBOM-side frameworks (License Audit + CVE Remediation SLA)

Two new async-only frameworks that reuse the existing compliance engine,
renderers, retention, and GridFS artifact storage:

  * LicenseAuditFramework — evaluates project SBOM findings against the
    license policy (allow_strong_copyleft / allow_network_copyleft), plus
    a catch-all 'all components have identified licenses' control.
  * CveRemediationSlaFramework — checks that open vulnerabilities are
    fixed within platform SLAs (7 / 30 / 90 days for CRITICAL / HIGH /
    MEDIUM).

Both registered in FRAMEWORK_REGISTRY alongside the crypto frameworks
so the same /api/v1/compliance/reports endpoints / formats / webhooks
apply without any further wiring. Frontend NewReportDialog exposes them
in the Framework dropdown.

* refactor(frontend): extract useAnalyticsList hook for shared list scaffolding

New hook bundles the useQuery + isLoading + isEmpty + error pattern
every analytics panel in the codebase duplicates. HotspotTable is the
first consumer — VulnerabilityHotspots and other larger analytics
components can migrate incrementally in follow-up PRs.

The hook is generic over response + item type so it works with the
crypto HotspotResponse / VulnerabilityHotspot / PQC MigrationItem
shapes without coupling to any of them.

+3 unit tests covering the loading, empty, and error paths.

* refactor(compliance): consolidate duplicated framework helpers

Extract 3 helpers to base.py (public API): status_value,
extract_finding_id, build_summary, build_residual_risks. The license,
cve-remediation and pqc frameworks each had their own near-identical
_build_summary / _residual_risks / status_value / finding-id extraction
— now all four share one implementation.

Net effect: ~60 LOC removed from the 3 SBOM/PQC frameworks; behaviour
unchanged (20/20 framework tests still pass).

* refactor(frontend): extract useDialogState hook

Tiny hook that owns the useState(false) + openDialog/closeDialog/
toggleDialog boilerplate every shadcn Dialog needs. Consuming the hook
cuts 3 lines + one inline callback per Dialog owner.

ComplianceReportsPanel is the first consumer. Other Dialog owners
(WebhookManager, PruneAuditDialog, NewReportDialog, etc.) can migrate
incrementally in follow-up commits.

* refactor(crypto): IANA catalog uses live-fetch + Redis cache pattern

Align with how every other external-data analyzer in the codebase
handles upstream snapshots (OSV, EPSS, GHSA, deps.dev):

  1. In-process memoization for the hot path
  2. Redis cache_service for cross-pod deduplication (7-day TTL)
  3. Live fetch from iana.org with httpx.AsyncClient when Redis is cold
  4. Bundled YAML snapshot as offline fallback (air-gapped / first boot
     before internet is available)

The one-shot backend/scripts/generate_iana_catalog.py is removed — its
CSV-parse + weakness-derivation logic now lives inside the loader and
runs on every registry refresh automatically. The bundled YAML stays
committed as a documented, deterministic fallback (not the canonical
source any more).

protocol_cipher.py now awaits load_iana_catalog() in analyze() instead
of loading sync in __init__; the first analysis call after a cold
deploy hits iana.org and populates the shared Redis entry.
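The four-layer lookup described above can be sketched as follows. This is a minimal illustration, not the real loader: the `redis_get`/`redis_set`/`fetch_live`/`load_bundled` callables stand in for the cache service, the httpx fetch, and the YAML snapshot loader, and their names are assumptions.

```python
import asyncio
import time

_memo: dict[str, tuple[float, dict]] = {}
MEMO_TTL_SECONDS = 3600.0

async def load_iana_catalog(redis_get, redis_set, fetch_live, load_bundled) -> dict:
    """Layered lookup: in-process memo -> Redis -> live fetch -> bundled YAML."""
    now = time.monotonic()
    hit = _memo.get("iana")
    if hit is not None and now - hit[0] < MEMO_TTL_SECONDS:
        return hit[1]                                  # 1. hot path: in-process memo
    catalog = await redis_get("iana:catalog")          # 2. cross-pod Redis cache
    if catalog is None:
        try:
            catalog = await fetch_live()               # 3. live fetch from iana.org
            await redis_set("iana:catalog", catalog, ttl=7 * 24 * 3600)
        except Exception:
            catalog = load_bundled()                   # 4. committed snapshot fallback
    _memo["iana"] = (now, catalog)
    return catalog

async def _demo():
    calls = []

    async def redis_get(key):
        calls.append("redis_get")
        return None  # cold Redis

    async def redis_set(key, value, ttl):
        calls.append("redis_set")

    async def fetch_live():
        calls.append("fetch_live")
        return {"cipher_suites": 365}

    def load_bundled():
        return {"cipher_suites": 0}

    first = await load_iana_catalog(redis_get, redis_set, fetch_live, load_bundled)
    second = await load_iana_catalog(redis_get, redis_set, fetch_live, load_bundled)
    return first, second, calls

first, second, calls = asyncio.run(_demo())
```

The second call never touches Redis or the network: within the TTL, the memo absorbs the hot path, exactly as the first analysis call after a cold deploy is the only one that hits iana.org.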

* refactor(webhooks): centralise fire-and-forget pattern via safe_trigger_webhooks

Five callers (cbom_ingest, ingest, pqc_migration, compliance_reports,
audit/history) each wrapped webhook_service.trigger_webhooks in the same
try/except + logger.warning/exception boilerplate. Extract the wrapper
into webhook_service.safe_trigger_webhooks(...) — caller passes a
context= label that gets included in the log message.

Net effect: ~50 LOC removed from the 5 endpoint modules; exception
handling is now uniform (logger.exception, never logger.warning), so
log readers see consistent stack traces for every dispatch failure.

Audit/history keeps an outer try/except because _dispatch_webhook
constructs the payload inline — that step could raise on an unexpected
PolicyAuditEntry shape, independent of the webhook delivery itself.
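The extracted wrapper might look like the sketch below; the signature and logger name are assumptions, not the real service API.

```python
import asyncio
import logging

logger = logging.getLogger("webhook_service")

async def safe_trigger_webhooks(trigger, event: str, payload: dict, *, context: str) -> bool:
    """Fire-and-forget dispatch: a delivery failure never propagates."""
    try:
        await trigger(event, payload)
        return True
    except Exception:
        # Always logger.exception, so every dispatch failure logs a stack trace.
        logger.exception("Webhook dispatch failed (context=%s, event=%s)", context, event)
        return False

async def _demo():
    async def ok(event, payload):
        pass

    async def boom(event, payload):
        raise RuntimeError("delivery refused")

    return (
        await safe_trigger_webhooks(ok, "report.created", {}, context="compliance_reports"),
        await safe_trigger_webhooks(boom, "report.created", {}, context="compliance_reports"),
    )

results = asyncio.run(_demo())
```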

* refactor(models): extract MongoDocument base class (shared ConfigDict)

Four models (CryptoAsset, CryptoPolicy, PolicyAuditEntry, ComplianceReport)
all carried an identical model_config = ConfigDict(populate_by_name=True,
use_enum_values=True). Extract a MongoDocument base in app.models.types
that owns this config; subclasses inherit and only declare their fields.

Pydantic v2 merges model_config across the inheritance chain, so future
models that need additional config (e.g. extra='allow') can still extend
MongoDocument and add their own settings.
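A sketch of the base class and the config-merging behaviour (field names here are illustrative, not the real models):

```python
from enum import Enum

from pydantic import BaseModel, ConfigDict, Field

class MongoDocument(BaseModel):
    # Shared Mongo-document settings: allow population by field name
    # (so `_id` aliases work) and store enums as their values.
    model_config = ConfigDict(populate_by_name=True, use_enum_values=True)

class AssetType(str, Enum):
    ALGORITHM = "algorithm"

class CryptoAssetSketch(MongoDocument):
    # Pydantic v2 merges model_config across the inheritance chain,
    # so a subclass can add settings without repeating the base config.
    model_config = ConfigDict(extra="allow")

    id: str = Field(alias="_id")
    asset_type: AssetType

doc = CryptoAssetSketch(_id="asset-1", asset_type=AssetType.ALGORITHM)
```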

* refactor(frontend): migrate two dialog owners to useDialogState

PolicyAuditTimeline (PruneAuditDialog state) and ReportDetailDrawer
(delete-confirm state) now use the useDialogState hook from commit
97f1752 instead of useState(false) + setX(true)/setX(false)
boilerplate. Same behaviour, ~3 LOC less per consumer.

* refactor(constants): extract SPDX_* identifier constants for license maps

LICENSE_URL_PATTERNS and LICENSE_ALIASES both reference the same SPDX
IDs over and over (Apache-2.0 5×, GPL-2.0 4×, AGPL-3.0 4×, MPL-2.0
4×, LGPL-2.1 3×, GPL-3.0 4×, ...). Extract module-level constants

(SPDX_APACHE_2_0, SPDX_GPL_2_0, ...) so the maps reference symbols
rather than free-floating string literals. Eliminates the typo risk
flagged by SonarLint S1192.

* refactor(frontend): WebhookManager uses useDialogState

* refactor(frontend): split Recommendations.tsx into focused sub-modules

The 1227-LOC monolith is now four files under analytics/recommendations/:

  config.ts             ~245 LOC  priorityConfig + typeConfig + effortConfig
  RecommendationCard.tsx ~696 LOC  per-row rendering
  SummaryCard.tsx        ~157 LOC  summary header
  Recommendations.tsx     ~99 LOC  orchestration + data fetching (kept at
                                  the original path so consumers do not
                                  need to update imports)

Behaviour unchanged. The outer Recommendations.tsx had to remain at
its original path because macOS / Windows have case-insensitive
filesystems and a sibling 'recommendations/' directory cannot coexist
with a 'Recommendations.tsx' file in the same parent. Keeping the
component definition at the outer path side-steps that constraint while
still cleaning up the file size.

* refactor(analytics): split 1554-LOC analytics.py into a package

Old monolith is replaced by 8 focused modules under analytics/:

  __init__.py            22 LOC  aggregate router (re-exports for main.py)
  _shared.py             62 LOC  multi-endpoint helpers (_resolve_scan_id,
                                 _get_enrichment_info, _MSG_ACCESS_DENIED)
  summary.py            200 LOC  /summary, /dependencies/top, /dependency-types
  dependencies.py       259 LOC  /dependency-tree, /component-findings,
                                 /dependency-metadata
  risk.py               326 LOC  /impact, /hotspots
  search.py             459 LOC  /search, /vulnerability-search
  recommendations.py    231 LOC  /projects/{id}/recommendations
  update_frequency.py   155 LOC  /update-frequency, /comparison

All URLs unchanged: app/main.py still imports
'app.api.v1.endpoints.analytics' as before because __init__.py
re-exports the aggregate 'router' under the same name. No call sites
elsewhere in the codebase needed updating.

Endpoint-only helpers stayed co-located with their callers; only the
helpers used by 2+ endpoints moved to _shared.py.

* refactor(license): split 1437-LOC license.py into license_compliance package

Six modules under analyzers/license_compliance/:

  __init__.py        18 LOC  re-exports LicenseAnalyzer, LICENSE_DATABASE
  constants.py      582 LOC  string constants, regex splitters,
                             SEVERITY_RANK, LICENSE_INCOMPATIBILITIES,
                             LICENSE_DATABASE, CATEGORY_STAT_KEY,
                             lazy lowercase-mapping cache
  normalizer.py     161 LOC  normalize_license, extract_licenses,
                             has_spdx_expression, parse_spdx_expression
  compatibility.py  107 LOC  check_pair_conflict, collect_component_licenses,
                             find_license_conflicts, check_license_compatibility
  evaluator.py      424 LOC  per-category evaluate_* (weak/strong/network
                             copyleft), apply_transitive_adjustment,
                             should_include_finding, create_issue
  analyzer.py       315 LOC  LicenseAnalyzer class with orchestration +
                             back-compat wrappers for private methods that
                             tests in tests/test_services/test_analyzers/
                             test_license_analyzer.py call by name

The original analyzers/license.py is now a 10-LOC re-export shim so
existing 'from app.services.analyzers.license import LicenseAnalyzer'
imports keep resolving.

* refactor(chat): split 3022-LOC tools.py into tools package

Five modules under chat/tools/, the largest two split off the
TOOL_DEFINITIONS data block (~1015 LOC) and the dispatcher class:

  __init__.py        113 LOC  re-exports public surface; pre-imports the
                              symbols (ScopeResolver, PQCMigrationPlanGenerator,
                              ComplianceReportRepository, PolicyAuditRepository,
                              ComplianceReportEngine, FRAMEWORK_REGISTRY,
                              ReportFramework, ResolvedScope) so existing
                              patch('app.services.chat.tools.X') calls keep
                              working
  _helpers.py        273 LOC  module-level helpers: _clamp_limit, _clip_value,
                              _serialize_finding_for_llm, severity bucket fns,
                              version compare, URL injection, payload truncation,
                              _serialize_doc
  definitions.py    1038 LOC  pure data: TOOL_DEFINITIONS, TOOL_PERMISSIONS,
                              get_tool_definitions()
  crypto_tools.py    324 LOC  the 12 module-level async tool fns
                              (list_crypto_assets, get_crypto_summary,
                              generate_pqc_migration_plan, list_compliance_reports,
                              list_policy_audit_entries,
                              get_framework_evaluation_summary, ...)
  registry.py       1454 LOC  ChatToolRegistry class (the dispatcher) — kept
                              intact; splitting it would risk regressions in
                              23+ tool branches

crypto_tools.py looks up patched symbols (ScopeResolver etc.) lazily off
the package namespace via a tiny _pkg() helper, so existing test patches
on app.services.chat.tools.X keep targeting the right attribute. The
old tools.py is deleted; the package directory replaces it.

* refactor(aggregation): split 1161-LOC aggregator into focused modules

Break aggregator.py into an aggregation/ sub-package with one
responsibility per file. ResultAggregator stays in aggregator.py and
keeps thin underscore-prefixed wrappers so tests that patch private
methods continue to work.

- components.py: normalize_component, extract_artifact_name
- cross_link.py: cross_link_pair, add_context_to_vulnerability
- merging.py: SAST/vuln/findings merge helpers
- quality.py: update_quality_description
- scorecard.py: enrich_with_scorecard (cache passed in)
- versions.py: parse_version_key, calculate_aggregated_fixed_version

Old aggregator.py becomes a 10-line re-export shim for external
callers.

* chore(backend): trim verbose comments in aggregation package

Drop module docstring boilerplate, comments that restate the next line
of code, and inline 'what' annotations. Keep workaround/why comments
intact.

* chore(frontend): trim verbose comments and dead JSDoc

Drop docstrings that restate function names, inline 'what' comments,
and historical migration notes. Shorten the WebhookManager EVENT_ALIASES
preamble and remove stale lingering-thought TODOs.

* chore(backend): trim verbose comments across services and core

Second pass covering chat tools, license compliance, analytics endpoints,
core utilities, repositories, and other services. Drop docstrings that
restate function names, inline 'what' annotations, section dividers,
and stale migration commentary. Keep workaround/why comments intact.

* fix(waivers): correct global-waiver lookups and stale UI after mutation

- MCP tools queried {global: True} but the Waiver model uses
  project_id=None for global scope, so get_waiver_status,
  list_global_waivers, and get_expiring_waivers never matched any
  global waiver. Switch to project_id=None and add the global branch
  to the expiring-waivers $or filter.
- Drop the {status: $ne expired} filter — only accepted_risk and
  false_positive exist as statuses; expiry is enforced via
  expiration_date alone.
- get_severity_distribution and get_vuln_counts_by_components now
  exclude waived findings, matching the convention used by stats.py
  and the other MCP tools that already filter waived: $ne True.
- Waiver create/update/delete mutations now also invalidate scan,
  analytics, and project query keys so finding lists, severity
  badges, and dashboard counts refresh without manual reload.

* chore(backend): remove dead severity/type count repository methods

get_severity_counts and get_type_counts on FindingRepository have no
callers anywhere in the codebase. They also missed the waived-finding
filter that the live get_severity_distribution now applies, so leaving
them in place would just be a footgun for future callers.

* refactor(recommendation): drop dead try/except in calculate_best_fix_version

Both branches returned the same value, and parse_version_tuple cannot
raise (its int() input comes from a digits-only regex match), so the
guard was unreachable defensive code without a documented threat model.
Inline the sort and let any future regression surface as a real error.

* refactor(frontend): share extractErrorMessage via lib/errors.ts

ReportDetailDrawer and PolicyAuditTimeline both carried byte-identical
copies of the API-error extractor. Extract it once so future error
shapes only need updating in one place.

* refactor(frontend): unify date formatting via formatDate/formatDateTime

Compliance, audit, crypto-analytics, and PQC views all bypassed the
formatDate / formatDateTime helpers in lib/utils.ts and called
toLocaleString / toLocaleDateString inline. Route them through the
helpers so display formatting stays consistent and we have a single
seam for future changes (i18n, fixed locale, invalid-date fallback).

* chore(api): parameterise DatabaseDep with AsyncIOMotorDatabase[Any]

Other definitions in init_db.py already use the parameterised form; the
DatabaseDep alias was the odd one out. Aligning means a future
mypy --strict pass won't have to deal with this single missing type
argument across every endpoint that depends on it.

* perf(compliance): drop unread fields and warn on findings cap hit

_collect_findings loaded every column of up to 20k findings into memory
to evaluate compliance, even though no framework reads description,
scanners, found_in, aliases, or related_findings. Add a projection that
excludes them and surface a warning when the cap is reached so we know
when a scope is silently truncated.

* feat(analytics): warn when user-scope project list is truncated

The user-scope analytics path silently capped accessible projects at
10k. Surface a warning with the user id so we can detect the day a
deployment grows past it instead of producing a quietly-incomplete
analytics view.

* perf(housekeeping): skip never-scanned projects in rescan loop

The rescan scheduler iterated every project every cycle, even though
projects with no last_scan_at can't be rescanned and were dropped on
the first check inside the loop. Filter server-side so the cursor
returns only candidates that could plausibly qualify.
The /signup endpoint accepted UserCreate, which inherited permissions,
is_active, and auth_provider from UserBase and splatted them straight
into the User model. An unauthenticated request could therefore set
arbitrary permissions on the new account.

Introduce a dedicated UserSignup schema that only exposes safe fields
(email, username, password, optional notification metadata) and build
the User explicitly with hardcoded permissions=[], is_active=True,
auth_provider="local".
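The shape of that fix might look like the sketch below (field names trimmed to the essentials; the optional notification metadata is omitted):

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class UserSignup(BaseModel):
    """Only the fields an unauthenticated caller may set (sketch)."""
    model_config = ConfigDict(extra="forbid")  # reject permissions, is_active, ...

    email: str
    username: str
    password: str

def build_user(signup: UserSignup) -> dict:
    # Privileged fields are hardcoded, never copied from the request body.
    return {
        "email": signup.email,
        "username": signup.username,
        "permissions": [],
        "is_active": True,
        "auth_provider": "local",
    }

user = build_user(UserSignup(email="a@example.com", username="alice", password="s3cret!pw"))

try:
    UserSignup(email="b@example.com", username="bob", password="pw",
               permissions=["system:manage"])
    smuggled = True
except ValidationError:
    smuggled = False
```

With `extra="forbid"` a body that tries to smuggle `permissions` fails validation outright instead of being silently ignored.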
Pull in the dependabot-style version bumps from main (poetry + pnpm
lock files plus pyproject/package.json constraint relaxations) and
fix three useEffect-based state syncs that the upgraded
eslint-plugin-react-hooks now flags via the
react-hooks/set-state-in-effect rule. The fixes use the
adjust-state-during-render pattern recommended by the React docs
instead of suppressing the rule.
# Conflicts:
#	backend/poetry.lock
#	frontend/package.json
#	frontend/pnpm-lock.yaml
The previous validator used str.startswith against a tuple of allowed
prefixes, which let several attacks slip through:

  * userinfo bypass: http://localhost@evil.com/ matched the localhost
    prefix but resolved to evil.com on delivery
  * suffix bypass: http://localhost.evil.com/ likewise matched
  * any RFC1918, link-local, multicast, reserved or cloud-metadata IP
    literal (incl. 169.254.169.254) was accepted under https://
  * webhook delivery itself never re-checked the resolved IP, so a
    public hostname could rebind to an internal address

Replace the prefix check with a urlparse + ipaddress based validator
that:
  * forces scheme http or https (case-insensitive)
  * rejects empty hostnames and known cloud-metadata DNS names
  * allows plain HTTP only for loopback hosts
  * rejects IP literals in private/loopback/link-local/multicast/
    reserved/unspecified ranges
  * gates loopback targets behind WEBHOOK_ALLOW_LOCALHOST so production
    can disable in-pod delivery entirely

Add assert_safe_webhook_target as a delivery-time guard that resolves
the hostname and rejects if any returned address falls into a blocked
range, mitigating DNS rebinding. Wire it into both _send_with_retries
and test_webhook; treat its ValueError as a non-retriable policy
rejection.
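A condensed sketch of the urlparse + ipaddress approach; this omits the delivery-time DNS resolution step (assert_safe_webhook_target) and hardcodes the localhost toggle as a keyword argument rather than reading WEBHOOK_ALLOW_LOCALHOST:

```python
import ipaddress
from urllib.parse import urlparse

METADATA_HOSTS = {"169.254.169.254", "metadata.google.internal"}

def validate_webhook_url(url: str, *, allow_localhost: bool = False) -> None:
    """Raise ValueError for unsafe webhook targets (sketch of the approach)."""
    parsed = urlparse(url)
    scheme = (parsed.scheme or "").lower()
    if scheme not in {"http", "https"}:
        raise ValueError("scheme must be http or https")
    # urlparse.hostname strips userinfo, so http://localhost@evil.com
    # yields host 'evil.com' here, defeating the userinfo bypass.
    host = parsed.hostname
    if not host:
        raise ValueError("empty hostname")
    if host in METADATA_HOSTS:
        raise ValueError("cloud metadata endpoint is blocked")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # not an IP literal
    if ip is not None and (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    ) and not (ip.is_loopback and allow_localhost):
        raise ValueError(f"blocked IP literal: {ip}")
    is_loopback = host == "localhost" or (ip is not None and ip.is_loopback)
    if is_loopback and not allow_localhost:
        raise ValueError("loopback target; enable WEBHOOK_ALLOW_LOCALHOST to permit")
    if scheme == "http" and not is_loopback:
        raise ValueError("plain HTTP is only allowed for loopback hosts")

def _raises(url: str) -> bool:
    try:
        validate_webhook_url(url)
        return False
    except ValueError:
        return True

rejected = [
    url for url in (
        "http://localhost@evil.com/",       # userinfo bypass
        "http://localhost.evil.com/",       # suffix bypass
        "https://169.254.169.254/latest/",  # cloud metadata
        "https://10.0.0.5/hook",            # RFC1918 literal
    )
    if _raises(url)
]
```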
Three fixes that together let the backend pipeline pass:

- Add MongoDB 7 and Redis 7 service containers to the test job. The
  chat-AI assistant tests (test_chat_repository.py,
  test_chat_rate_limiter.py) and the crypto policy seeder tests
  hard-code localhost:27017 / localhost:6379 connections, but the
  workflow had no services declared so they always errored on connect.
- Type _is_blocked_ip / _parse_ip with IPv4Address | IPv6Address
  instead of the internal ipaddress._BaseAddress. The internal class
  doesn't expose is_private / is_loopback / is_link_local / etc., so
  mypy reported attr-defined plus a no-any-return cascade.
- Extend the FindingType enum membership test with the
  CRYPTO_KEY_MANAGEMENT value introduced in phase 3 — the test set
  was hard-coded against the pre-phase-3 enum and now diverges.
The crypto analyzers (crypto_weak_algorithm, crypto_weak_key,
crypto_quantum_vulnerable, crypto_certificate_lifecycle,
crypto_protocol_cipher) emit findings in the canonical Finding shape
but were missing from ResultAggregator's normalizer dispatch. Their
output landed in analysis_results and was silently dropped before
reaching the findings collection — a CBOM scan with weak crypto would
end with findings_count=0 even though the analyzers reported issues.

Add a crypto normalizer that rehydrates each dict into a Finding and
register it for all five crypto analyzer names.
End-to-end testing exposed three connected bugs that together caused
CBOM-only scans to record zero findings even when weak crypto was
present:

1. cbom_ingest never tagged the scan with scan_type="cbom". The
   analysis engine keys on this to force crypto analyzers into the
   active set and to synthesise an empty SBOM pass for CBOM-only
   scans, so without it neither happened.
2. The Scan model didn't declare a scan_type field, so even when the
   ingest path set it on the document, Pydantic stripped it on
   read — getattr(scan_doc, "scan_type", None) was always None.
3. _process_sbom bailed on the synthesised empty {} via
   "if not current_sbom", aborting before the analyzer loop ran.
   Switch to "is None" so empty dicts pass through, and skip the
   SBOM-format scanners (trivy/grype/osv/deps_dev) when no real
   SBOM content was resolved — they would crash on the empty dict
   otherwise.
Reports persist artifact_gridfs_id as a string for JSON-roundtrip
friendliness. Motor's GridFS bucket APIs (open_download_stream,
delete) reject string ids and raise InvalidId / KeyError, which the
download endpoint catches and surfaces as 410 Gone with "Artifact
storage error" — every successful report download was broken.

Wrap the string in ObjectId() at the call sites in the download
endpoint and the retention sweep. Update the format-coverage test
fixture to mint ObjectId-shaped keys so the fake bucket lines up
with production semantics.
The hotspot enrichment grouped findings by 'details.rule_id' and
matched against item.key — but item.key carries the asset dimension
value (e.g. 'MD5', 'RSA', 'algorithm', 'hash') from the asset
aggregation, never a rule id. Every join missed and every hotspot
returned finding_count=0 / severity_mix={}, even when matching
findings clearly existed for the asset.

Pivot the enrichment based on group_by:
- name      -> details.asset_name
- primitive -> details.primitive
- asset_type -> details.asset_type

severity / weakness_tag don't have a clean per-asset path into the
findings collection; leave their counts at zero rather than show
junk.
A non-admin user could call PUT /users/<own-id> and pass
{"permissions": ["system:manage", ...]} or {"is_active": false} in
the body — the endpoint gated only on "caller has user:update OR is
self" and then forwarded the entire UserUpdate payload to the
repository, so any authenticated user could grant themselves arbitrary
permissions.

Three new gates, aligned with the project's fine-grained-permissions
model (no implicit admin role):

- Setting 'permissions' now requires the new
  user:manage_permissions capability — separating routine user
  edits (help-desk admins) from privilege management.
- Even with user:manage_permissions, the caller cannot grant a
  permission they don't already hold themselves (subset rule),
  so a privilege manager can't promote anyone above their own
  ceiling.
- Toggling 'is_active' on yourself is forbidden regardless of held
  permissions, to prevent self-lockout (or last-admin-disabled
  scenarios).

Reproduced the exploit on a running stack with a user holding only
user:read; PUT /users/<self> with {permissions: [system:manage, ...]}
elevated successfully against the unfixed code. After the fix the
same call returns 403 with a clear reason, and the subset rule
returns 403 listing the unauthorised permissions.
The two finding-bound hotspot dimensions both ran the asset-first
pipeline they couldn't satisfy:

- severity grouped crypto_assets by $severity, but crypto_assets
  doesn't carry a severity field — the pipeline always returned an
  empty result.
- weakness_tag mapped to $asset_type by mistake (copy-paste) and
  produced the same output as the asset_type grouping.

Add a separate finding-first aggregation path for these two: filter
crypto findings, group by severity (or by unwound
details.weakness_tags), and report finding_count plus the count of
distinct bom_refs as asset_count. The asset-bound dimensions
(name/primitive/asset_type) keep their existing path and behaviour.

Verified end-to-end against the legacy_crypto_mixed scan:
  group_by=severity     -> 4 HIGH + 1 MEDIUM (matches 5 findings)
  group_by=weakness_tag -> 3 cipher-suite weaknesses surfaced
                           (no-forward-secrecy, weak-cipher-rc4,
                            weak-mac-sha1)
… silently

normalize_crypto previously caught any Finding(**item) validation error
with a bare except: continue, so analyzer output drift (a renamed
field, an unexpected enum value) would silently delete findings from
the scan with zero visibility. Replace with a logger.warning that
includes the offending finding's id and type so operators see the drop
in the backend logs and can root-cause it.
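The pattern, sketched against a hypothetical subset of the canonical Finding shape:

```python
import logging

from pydantic import BaseModel, ValidationError

logger = logging.getLogger("aggregator")

class FindingSketch(BaseModel):
    # Hypothetical subset of the canonical Finding shape.
    id: str
    finding_type: str
    severity: str

def normalize_crypto(items: list[dict]) -> list[FindingSketch]:
    findings = []
    for item in items:
        try:
            findings.append(FindingSketch(**item))
        except ValidationError:
            # Log the drop instead of swallowing it: analyzer output drift
            # (a renamed field, an unexpected value) must stay visible.
            logger.warning(
                "Dropping malformed crypto finding id=%s type=%s",
                item.get("id"), item.get("finding_type"),
            )
    return findings

kept = normalize_crypto([
    {"id": "f1", "finding_type": "crypto_weak_algorithm", "severity": "HIGH"},
    {"id": "f2", "finding_type": "crypto_weak_key"},  # missing severity -> dropped
])
```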
…o findings

Crypto findings (crypto_weak_algorithm, crypto_weak_key,
crypto_quantum_vulnerable, crypto_weak_protocol, crypto_protocol_cipher,
the seven crypto_cert_* lifecycle types, and crypto_key_management)
were never seen by the recommendation engine — its dispatcher had
handlers for vulnerabilities/secrets/sast/iac/licenses/quality but no
crypto handler. A scan with five crypto findings produced
"recommendations: []" with summary["crypto_issues"] missing entirely.

Add app.services.recommendation.crypto.process_crypto, six new
RecommendationType enum values (REPLACE_WEAK_ALGORITHM,
INCREASE_KEY_SIZE, UPGRADE_PROTOCOL, PQC_MIGRATION,
ROTATE_CERTIFICATE, REPLACE_WEAK_CIPHER_SUITE), and dispatcher wiring
in recommendations.py. The handler groups findings by
(finding_type, asset_name) and emits a single recommendation per
group with priority derived from the worst severity, effort tuned per
type (cert rotation = LOW, PQC = HIGH), and per-type suggested
replacements (MD5/SHA-1 -> SHA-256 or SHA-3, DES/3DES/RC4 -> AES-256-GCM,
TLS<1.2 -> TLS 1.2+ AEAD, RSA<2048 -> 3072-bit, quantum-vulnerable
PKE/SIG -> route to /pqc-migration plan).

The endpoint summary now reports a crypto_issues counter and a crypto
key in finding_counts so dashboards stop hiding crypto remediations.

Verified end-to-end against the legacy_crypto_mixed scan: the five
crypto findings (MD5, RSA-1024 weak-key, RSA-1024 quantum-vulnerable,
TLS 1.0 weak-protocol, TLS 1.0 weak-algorithm) produce five
recommendations with the expected types and severities.
list_reports and get_report previously gated only on
get_current_active_user, so any authenticated user could enumerate
every report's metadata across every project, team, and global scope —
including scope_id, framework, and the requester's identity. The
download endpoint already passed each report through ScopeResolver;
the list/get pair leaked everything else.

Add a shared _user_can_see_report helper that runs the same
ScopeResolver(report.scope, report.scope_id) the download path uses.
list_reports filters the returned page in place (the result may shrink
below limit when the user has partial access; pagination of the full
underlying set stays stable). get_report returns 404 instead of 403 on
mismatch so callers can't probe for existence.
Iso19790Framework reused FIPS controls and replaced the prefix on the
ControlDefinition.control_id, but the closure inside
_make_disallowed_evaluator captured the FIPS prefix in its own
ControlResult.control_id assignment. Every disallowed-category result
in an ISO report therefore came back labelled FIPS-140-3-... — wrong
identifiers, broken renderer mapping, and mixed framework IDs in any
downstream consumer that filters by framework prefix.

Extract build_disallowed_algorithm_controls(data, control_id_prefix)
plus a control_id-aware _make_disallowed_evaluator factory in the
FIPS module, and have ISO build its own controls via the same factory
with the ISO-19790 prefix. Live verification: ISO report now emits
ISO-19790-HASH_FUNCTIONS / SYMMETRIC_CIPHERS / ASYMMETRIC /
RSA-MIN-2048.
The /ingest/cbom endpoint accepted bodies of arbitrary size; the
50_000-asset cap only takes effect after parse_cbom has fully
deserialised the payload. An authenticated client could submit a
multi-GB CBOM and force the worker to allocate it.

Add a Depends() that reads Content-Length and rejects with 413 when
above 25 MiB before Pydantic touches the body. Chunked uploads bypass
the header check; that path is bounded by the ASGI server's own
limits.

Verified with a 26 MiB payload: 413 response with the byte counts in
the detail.
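A framework-agnostic sketch of the guard; in the real endpoint this logic runs inside a FastAPI Depends() and raises HTTPException(413) rather than a custom exception.

```python
MAX_CBOM_BYTES = 25 * 1024 * 1024  # 25 MiB

class PayloadTooLarge(Exception):
    pass

def check_content_length(headers: dict[str, str]) -> None:
    """Reject oversized bodies before any JSON parsing happens.

    Chunked uploads send no Content-Length and pass through here;
    that path stays bounded by the ASGI server's own limits."""
    raw = headers.get("content-length")
    if raw is None:
        return
    try:
        declared = int(raw)
    except ValueError:
        return  # malformed header; let the framework handle it
    if declared > MAX_CBOM_BYTES:
        raise PayloadTooLarge(
            f"payload is {declared} bytes, limit is {MAX_CBOM_BYTES}"
        )

try:
    check_content_length({"content-length": str(26 * 1024 * 1024)})
    oversize_rejected = False
except PayloadTooLarge:
    oversize_rejected = True
```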
…erns

A CryptoRule with quantum_vulnerable=True and no match_name_patterns
matched every PKE/SIGNATURE/KEM asset, including post-quantum
primitives like ML-KEM and ML-DSA — exactly the algorithms the rule
exists to recommend migrating *to*.

Add a model_validator on CryptoRule that requires match_name_patterns
when quantum_vulnerable=True, and drop the now-redundant pattern
re-check in the matcher's quantum_vulnerable branch. The seeded NIST
PQC rule already supplies patterns, so existing deployments are
unaffected; user-authored rules now fail validation up front instead
of silently misfiring.
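The validator, sketched against an illustrative subset of the real CryptoRule fields:

```python
from pydantic import BaseModel, ValidationError, model_validator

class CryptoRuleSketch(BaseModel):
    # Illustrative subset of the real CryptoRule fields.
    rule_id: str
    quantum_vulnerable: bool = False
    match_name_patterns: list[str] = []

    @model_validator(mode="after")
    def _require_patterns_for_quantum(self):
        if self.quantum_vulnerable and not self.match_name_patterns:
            raise ValueError(
                "quantum_vulnerable=True requires match_name_patterns; "
                "an unscoped rule would also match PQC primitives like ML-KEM"
            )
        return self

scoped = CryptoRuleSketch(
    rule_id="nist-pqc-rsa",
    quantum_vulnerable=True,
    match_name_patterns=["^RSA", "^ECDSA"],
)

try:
    CryptoRuleSketch(rule_id="unscoped", quantum_vulnerable=True)
    unscoped_accepted = True
except ValidationError:
    unscoped_accepted = False
```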
…tIdentifier

In CycloneDX 1.6 parameterSetIdentifier is a string parameter-set name
("P-256", "ML-KEM-1024", ...), not always a key-size integer. The
parser only treated it as numeric, so any algorithm whose parameter
set isn't a bare integer (every ECC and PQC primitive) had
key_size_bits=None and could never trigger match_min_key_size_bits
rules.

Try int() on parameterSetIdentifier first (covers RSA/AES where the
field is conventionally a bit count), then fall back to common custom
properties (cryptography:key_size, key_size, keySize, ...). Coercion
failures log at debug level so operators can find producers that
emit unparseable values.
ScopeResolver._resolve_user ignores scope_id and only resolves the
caller's own projects, so the previous list/get fix happily resolved
another user's user-scope report — every authenticated caller could
read every user-scope report. The download path has the same gate, so
the artifact was reachable too.

Special-case scope=='user' in _user_can_see_report: only the
requested_by user (or a system:manage holder, kept as the same admin
escape hatch already used by delete_report) sees the report.
Reproduced the leak with two users on the live stack: admin creates a
user-scope report, lowpriv now gets 404 and the report doesn't appear
in lowpriv's list.
int(True) == 1 in Python, so a CBOM with parameterSetIdentifier:true
would silently set key_size_bits=1 and trip every match_min_key_size_bits
rule. Same hazard in the property-fallback path. Negative or zero key
sizes are also nonsense for any rule that triggers on
'asset.key_size_bits < threshold'.

Extract _coerce_positive_int that rejects bools (the isinstance check
must come before int() because bool is a subclass of int) and any
non-positive value. Use it for both parameterSetIdentifier and the
property fallbacks.

Verified: BoolKey, NegKey, ZeroKey -> key_size_bits=None;
GoodKey('2048') -> 2048.
Iso19790Framework reached into Fips1403Framework._data, a private
attribute that could be renamed without warning. Promote it to a
public 'data' cached_property and have ISO read through that. Pure
naming change — no behavioural difference.
_latest_scan_for_project pulled up to 1000 scans per project and
filtered/sorted in memory. The status-bound $in filter and the
created_at sort are both supported by the fake DB used in tests
(verified) and by Motor in production, so they belong in the query.

Replaces a per-project N×1000-doc transfer with a one-doc cursor; the
silent drop of older scans on high-throughput projects is also gone.
Schema rules tighten over time (most recently the
quantum_vulnerable-requires-patterns validator on CryptoRule). A
project override authored against an older schema would propagate
ValidationError out of CryptoPolicyResolver.resolve at scan time —
crashing every analysis run that touches that project until an
operator notices.

Add validate_persisted_policies(db) that walks the crypto_policies
collection at startup and logs a warning per non-validating document
without raising. Hook it after seed_crypto_policies in the startup
event so deployments get a single, time-of-startup signal instead of
a recurring runtime crash.
The previous list_reports fix post-filtered the page after fetching,
which meant pagination shrank silently when a caller had partial
scope access — pages of < limit results with no signal of how many
reports they couldn't see.

Move the scope check into a Mongo $or filter that captures every
scope the caller can see in one go (scope=user iff requested_by ==
caller, scope=project iff scope_id in caller's projects, scope=team
iff in caller's team list, scope=global iff analytics:global or
system:manage). Pass it as ComplianceReportRepository.list's new
extra_filter kwarg so skip/limit run on the already-restricted set
and pages always return up to limit accessible reports.
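Assembled as a plain dict, the $or filter might look like this; the user-dict shape and field names are illustrative, the scopes and permission names come from the commit text.

```python
def build_report_visibility_filter(user: dict) -> dict:
    """Assemble the visibility $or filter for list_reports (sketch)."""
    clauses = [
        {"scope": "user", "requested_by": user["id"]},
        {"scope": "project", "scope_id": {"$in": user["project_ids"]}},
        {"scope": "team", "scope_id": {"$in": user["team_ids"]}},
    ]
    if {"analytics:global", "system:manage"} & set(user["permissions"]):
        clauses.append({"scope": "global"})
    return {"$or": clauses}

lowpriv = {"id": "u1", "project_ids": ["p1"], "team_ids": [], "permissions": []}
admin = {"id": "u2", "project_ids": [], "team_ids": [], "permissions": ["system:manage"]}

lowpriv_filter = build_report_visibility_filter(lowpriv)
admin_filter = build_report_visibility_filter(admin)
```

Because skip/limit run after this filter, every page contains only reports the caller can see, and pages fill up to limit.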

The fake DB used by integration tests didn't recognise dotted field
paths ('members.user_id'), so ScopeResolver returned no projects in
those tests. Extend the fake _fake_match_doc with a recursive
_resolve_dotted helper that walks lists too — matching real Mongo
semantics — so the visibility branch resolves correctly under tests.
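A recursive dotted-path resolver along those lines could look like this (a sketch of the fake-DB helper's idea, not the actual test code):

```python
def resolve_dotted(doc, path: str) -> list:
    """Resolve a dotted field path the way MongoDB matching does:
    descend dicts by key and fan out across list elements,
    collecting every matched leaf value."""
    def _walk(node, parts):
        if not parts:
            yield node
            return
        head = parts[0]
        if isinstance(node, dict) and head in node:
            yield from _walk(node[head], parts[1:])
        elif isinstance(node, list):
            for item in node:
                yield from _walk(item, parts)
    return list(_walk(doc, path.split(".")))

project = {"members": [{"user_id": "u1", "role": "owner"},
                       {"user_id": "u2", "role": "viewer"}]}
matches = resolve_dotted(project, "members.user_id")
```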
When SHA-1 (or any algorithm flagged by both BSI TR-02102 and NIST
SP 800-131A, etc.) appears in a CBOM, both rules matched the same
asset and the analyzer emitted two separate findings — inflating
findings_count, severity_mix, and dashboard counters by 2x for every
multi-framework hit.

Group rules per asset before building findings: emit one finding per
(asset, finding_type), keep the strictest severity as the lead, and
record every matched rule under details.matched_rules so compliance
evaluators and audit views still see the full cross-framework
agreement. References are deduplicated and merged in the same step.

Verified live: a CBOM containing SHA-1 now produces 1 finding (not 2)
with both bsi-02102-sha1-deprecated and nist-131a-sha1 listed under
matched_rules.
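The grouping step described above can be sketched as follows; the dict shapes and reference URLs are illustrative stand-ins for the real models.

```python
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}

def dedupe_rule_matches(matches: list[dict]) -> list[dict]:
    """Collapse multi-framework rule hits into one finding per
    (asset, finding_type)."""
    grouped: dict[tuple, list[dict]] = {}
    for match in matches:
        grouped.setdefault((match["bom_ref"], match["finding_type"]), []).append(match)
    findings = []
    for (bom_ref, finding_type), hits in grouped.items():
        lead = max(hits, key=lambda h: SEVERITY_RANK[h["severity"]])
        findings.append({
            "bom_ref": bom_ref,
            "finding_type": finding_type,
            "severity": lead["severity"],                   # strictest wins
            "matched_rules": [h["rule_id"] for h in hits],  # full agreement kept
            "references": sorted({r for h in hits for r in h.get("references", [])}),
        })
    return findings

findings = dedupe_rule_matches([
    {"bom_ref": "sha1", "finding_type": "crypto_weak_algorithm",
     "severity": "MEDIUM", "rule_id": "bsi-02102-sha1-deprecated",
     "references": ["https://example.org/bsi"]},
    {"bom_ref": "sha1", "finding_type": "crypto_weak_algorithm",
     "severity": "HIGH", "rule_id": "nist-131a-sha1",
     "references": ["https://example.org/nist"]},
])
```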
@morzan1001 morzan1001 merged commit 7f8890e into main Apr 29, 2026
6 of 7 checks passed